144 research outputs found

    A Fast Integrated Planning and Control Framework for Autonomous Driving via Imitation Learning

    For safe and efficient planning and control in autonomous driving, we need a driving policy which can achieve desirable driving quality over a long-term horizon with guaranteed safety and feasibility. Optimization-based approaches, such as Model Predictive Control (MPC), can provide such optimal policies, but their computational complexity is generally unacceptable for real-time implementation. To address this problem, we propose a fast integrated planning and control framework that combines learning- and optimization-based approaches in a two-layer hierarchical structure. The first layer, defined as the "policy layer", is established by a neural network which learns the long-term optimal driving policy generated by MPC. The second layer, called the "execution layer", is a short-term optimization-based controller that tracks the reference trajectories given by the "policy layer" with guaranteed short-term safety and feasibility. Moreover, with efficient and highly representative features, a small-size neural network is sufficient in the "policy layer" to handle many complicated driving scenarios. This enables online imitation learning with Dataset Aggregation (DAgger), so that the performance of the "policy layer" can be improved rapidly and continuously online. Several example driving scenarios are demonstrated to verify the effectiveness and efficiency of the proposed framework.
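
    The DAgger loop mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the one-dimensional system, the proportional `expert_action` standing in for the MPC expert, and the linear least-squares learner standing in for the neural network are all hypothetical simplifications chosen to keep the example short.

```python
import random

def expert_action(state):
    # Hypothetical stand-in for the MPC expert: push the state toward zero.
    return -0.5 * state

def fit_linear_policy(dataset):
    # Least-squares fit of action = gain * state (no intercept).
    num = sum(s * a for s, a in dataset)
    den = sum(s * s for s, _ in dataset)
    return num / den if den > 0 else 0.0

def rollout(gain, steps, rng):
    # Run the learner's current policy and record the states it visits.
    state = rng.uniform(-1.0, 1.0)
    visited = []
    for _ in range(steps):
        visited.append(state)
        state = state + gain * state + rng.gauss(0.0, 0.01)
    return visited

def dagger(iterations=5, steps=20, seed=0):
    rng = random.Random(seed)
    dataset = []   # aggregated (state, expert action) pairs
    gain = 0.0     # untrained learner
    for _ in range(iterations):
        states = rollout(gain, steps, rng)
        # The expert relabels the states the learner itself visited.
        dataset.extend((s, expert_action(s)) for s in states)
        # Retrain on everything collected so far.
        gain = fit_linear_policy(dataset)
    return gain

learned_gain = dagger()
```

    The key point of the sketch is that training data is gathered along the learner's own trajectories and labeled by the expert, then aggregated across iterations, rather than being collected only from expert-driven trajectories.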

    Bounded Risk-Sensitive Markov Games: Forward Policy Design and Inverse Reward Learning with Iterative Reasoning and Cumulative Prospect Theory

    Classical game-theoretic approaches for multi-agent systems in both the forward policy design problem and the inverse reward learning problem often make strong rationality assumptions: agents perfectly maximize expected utilities under uncertainties. Such assumptions, however, substantially mismatch observed human behaviors such as satisficing with sub-optimal, risk-seeking, and loss-averse decisions. In this paper, we investigate the problem of bounded risk-sensitive Markov Game (BRSMG) and its inverse reward learning problem for modeling realistic human behaviors and learning human behavioral models. Drawing on iterative reasoning models and cumulative prospect theory, we assume that humans have bounded intelligence and maximize risk-sensitive utilities in BRSMGs. Convergence analyses for both the forward policy design and the inverse reward learning problems are established under the BRSMG framework. We validate the proposed forward policy design and inverse reward learning algorithms in a navigation scenario. The results show that the behaviors of agents demonstrate both risk-averse and risk-seeking characteristics. Moreover, in the inverse reward learning task, the proposed bounded risk-sensitive inverse learning algorithm outperforms a baseline risk-neutral inverse learning algorithm by effectively recovering not only more accurate reward values but also the intelligence levels and the risk-measure parameters given demonstrations of agents' interactive behaviors. Comment: Accepted by 2021 AAAI Conference on Artificial Intelligence.
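
    For readers unfamiliar with cumulative prospect theory, the sketch below shows the two ingredients usually associated with it: a value function that treats losses more heavily than gains, and a probability weighting function that overweights small probabilities. The functional forms and default parameters are the commonly cited ones from Tversky and Kahneman (1992); the abstract does not state which forms or parameters the paper uses, so the names `cpt_value` and `cpt_weight` and all defaults here are assumptions for illustration only.

```python
def cpt_value(x, alpha=0.88, beta=0.88, lam=2.25):
    # Value function relative to a reference point of zero:
    # concave for gains, convex and steeper (loss aversion) for losses.
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** beta)

def cpt_weight(p, gamma=0.61):
    # Inverse-S probability weighting: small probabilities are
    # overweighted, large probabilities are underweighted.
    num = p ** gamma
    return num / ((num + (1 - p) ** gamma) ** (1 / gamma))

# A loss of 10 weighs more than an equal gain.
loss_looms_larger = abs(cpt_value(-10.0)) > cpt_value(10.0)

# A 1% chance is perceived as larger than 1%.
small_prob_overweighted = cpt_weight(0.01) > 0.01
```

    A risk-neutral agent corresponds to `alpha = beta = lam = gamma = 1`, in which case both functions reduce to the identity.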

    Investigation of plasticity in somatosensory processing following early life adverse events or nerve injury

    Chronic hypersensitive pain states can become established following sustained, repeated or earlier noxious stimuli and are notably difficult to treat, especially in cases where nerve injury contributes to the trauma. A key underlying reason is that a variety of plastic changes occur in the central nervous system (CNS) at spinal and potentially also supraspinal levels to upregulate functional activity in pain processing pathways. A major component of these changes is the enhanced function of excitatory amino acid receptors and related signalling pathways. Here we utilised rodent models of neuropathic and inflammatory pain to investigate whether evidence could be found for lasting hypersensitivity following neonatal (or adult) noxious stimuli, in terms of programming hyper-responsiveness to subsequent noxious stimuli, and whether we could identify underlying biochemical mechanisms. We found that neonatal (postnatal day 8, P8) nerve injury induced either long-lasting mechanical allodynia or shorter-lasting allodynia that nonetheless was associated with hyper-responsiveness to a subsequent noxious formalin stimulus at P42 despite recovery of normal mechanical thresholds. By developing a new micro-scale method for preparation of postsynaptic densities (PSD) from appropriate spinal cord quadrants we were able to show increased formalin-induced trafficking of GluA1-containing AMPA receptors into the PSD of animals that had received (and apparently recovered from) nerve injury at P8. This was associated with increased activation of ERK MAP kinase (a known mediator of GluA1 translocation) and increased expression of the ERK pathway regulator, Sos-1. Synaptic insertion of GluA1, as well as its interaction with a key partner protein 4.1N, was also seen in adults during a nerve injury-induced hypersensitive pain state.
    Further experiments were carried out to develop and optimise a new technological platform enabling fluorometric assessment of Ca2+ and membrane potential responses of acutely isolated CNS tissue: 30-100 μm tissue segments, synaptoneurosomes (synaptic entities comprising sealed and apposed pre- and postsynaptic elements) and 150 × 150 μm microslices. After extensive trials, specialised conditions were found that produced viable preparations, which could consistently deliver dynamic functional responses. Responsiveness of these new preparations to metabotropic and ionotropic receptor stimuli as well as nociceptive afferent stimulant agents was characterised in frontal cortex and spinal cord. These studies have provided new opportunities for assessment of plasticity in pain processing (and other) pathways in the CNS at the interface of in vivo and in vitro techniques. They allow, for the first time, valuable approaches such as microscale measurement of synaptic insertion of GluA1 AMPA receptor subunits and ex vivo assessment of dynamic receptor-mediated Ca2+ and membrane potential responses.